SJTU at TREC 2004: Web Track Experiments
نویسندگان
چکیده
Yiming Lu, Jian Hu, Fanyuan Ma ( Department of Computer Science & Engineering , S hanghai Jiaotong University , S hanghai 200030) {luyiniao , hujian , ma-fy}@sjtu.edu.cn Abstract: This is the first year our lab to participate in Trec. We participate in Mixed-Query task for the Web track. All the runs we submitted are based on the modified Okapi weighting scheme. Besides, we used several heuristics as the re-rank method: site-merging, minimal span weight, and etc. Also, the PageRank of a document is combined with the similarity of the document with the query to obtain an overall ranking of documents. Especially for the mixed-query task, we try a simple classification method to estimate whether the query is topic distillation or entry-page finding.
منابع مشابه
Novel Approaches in Text Information Retrieval - Experiments in the Web Track of TREC 2004
In this paper, we report our experiments in the mixed query task of the Web track for TREC 2004. We deal with the problem of ranking Web documents within a multicriteria framework and propose a novel approach for information retrieval. We focus on the design of a set of criteria aiming at capturing complementary aspects of relevance. Moreover, we provide aggregation procedures that are based on...
متن کاملOverview of the TREC 2004 Web Track
This year’s main experiment involved processing a mixed query stream, with an even mix of each query type studied in TREC-2003: 75 homepage finding queries, 75 named page finding queries and 75 topic distillation queries. The goal was to find ranking approaches which work well over the 225 queries, without access to query type labels. We also ran two small experiments. First, participants were ...
متن کاملTREC 2004 Web Track Experiments at CAS-ICT
This report presents CAS-ICT’s experiments on the Mixed query task of the TREC2004 Web track. Our work focused on combining different Web page evidences together to improve the overall retrieval performance. Four kinds of evidences, including body content(C), anchor texts (AT), basic structural information (S0) and extended structural information (S1) were considered for retrieval. Six combinat...
متن کاملRMIT University at TREC 2004
RMIT University participated in two tracks at TREC 2004: Terabyte and Genomics, both for the first time. This paper describes the techniques we applied and our experiments in both tracks, and discusses the results of the genomics track runs; the terabyte track results are unavailable at the time of manuscript submission. We also describe our new zettair search engine, in use for the first time ...
متن کاملIndri at TREC 2004: Terabyte Track
This paper provides an overview of experiments carried out at the TREC 2004 Terabyte Track using the Indri search engine. Indri is an efficient, effective distributed search engine. Like INQUERY, it is based on the inference network framework and supports structured queries, but unlike INQUERY, it uses language modeling probabilities within the network which allows for added flexibility. We des...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2004